Constant news streams and social media have heightened our awareness of criminal activity whilst making it harder to gauge actual safety, because incidents are frequently exaggerated. Although crime remains low by historical standards, rates have recently been rising in many countries. In the US, overall violent crime rose 4.2% from January to June 2022 compared with the same period the previous year, including a 12% increase in robberies and a 3% increase in aggravated assaults (Contreras, 2022).
Law enforcement agencies across the world already use data visualisation to monitor and prevent crime. For example, police in North America have been using systems that combine real-time locations with crime, traffic, and weather data to guide crime intervention. However, while the authorities have access to such data on crime rates in different locations, it has not been communicated effectively to the general public to improve their safety. As a result, residents and tourists lack access to accurate historical information on crime rates and rely instead on less robust sources: anecdotal information, local news, or a personal judgement of whether an area looks shady. This uncertainty leads many people, especially women, to simply avoid going out late at night or at other times generally considered more dangerous. That avoidance is often an unnecessary inconvenience, since it may rest on false alarms or outdated information.
Users also face pain points due to the lack of a well-substantiated directory of crime hotspots. Without comprehensive data, many are unable to plan ahead, for example by choosing a route that avoids unsafe times and areas. Nor can they check whether a planned route has potential dangers, which leaves them worried when they find themselves in an unsafe area that they can no longer avoid. For travellers who move between cities frequently (e.g. backpackers), such research is especially time-consuming and takes much of the joy out of their holidays by tying up time and raising their anxiety levels.
We therefore seek to create an app that gives users the information they need to judge crime rates in a particular location, so that they can move around the area with more confidence. This is especially valuable for women travelling by themselves and for tourists who are new to the area.
The application is aimed at the general public. Given the pain points described above, it is designed as a one-stop solution that contains all relevant crime information, so that users know which areas or profiles call for more caution. Besides presenting the full historical data so that users can see the big picture, it lets users select the attributes relevant to them and view information targeted to their own situation.
As a starting point, we chose to target one city, Los Angeles, which has a historically high crime rate (reported as 57.59%) and many types of publicly available datasets. It is also a popular tourist destination, with 29 million tourists a year. As such, the City of Angels is a suitable location on which to build the foundation of our application. We chose three types of datasets to scrape and compile, each presenting a different kind of relevant information to the user.
We obtained crime data for the city of Los Angeles from the Los Angeles Police Department dataset Crime Data from 2020 to Present.
The dataset presents crime information from February 11, 2020 to the present, with each row representing one crime incident. It contains in-depth information on each incident, including the date of occurrence, crime type, victim profile, location of the crime, and more. We were also able to access its API, so that our application is automatically updated with new data whenever the database receives its weekly update.
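As a rough illustration of how such an API refresh could be requested, the sketch below builds a query URL for all incidents on or after a given date. It assumes the dataset is served through a Socrata-style endpoint that accepts `$where`, `$order`, and `$limit` parameters; the endpoint URL, the `date_occ` column name, and the helper `build_crime_query()` are assumptions for illustration and should be checked against the dataset's API documentation.

```r
# Build a query URL for incidents on or after `since`.
# ASSUMPTIONS: the endpoint URL and the `date_occ` column are illustrative,
# not confirmed by the report.
build_crime_query <- function(since,
                              base_url = "https://data.lacity.org/resource/2nrs-mtv8.json",
                              limit = 50000) {
  since <- format(as.Date(since), "%Y-%m-%dT00:00:00")
  query <- c(
    "$where" = sprintf("date_occ >= '%s'", since),
    "$order" = "date_occ",
    "$limit" = format(limit, scientific = FALSE)
  )
  # Percent-encode every value so spaces, quotes, and colons are URL-safe
  values <- vapply(query, utils::URLencode, character(1), reserved = TRUE)
  paste0(base_url, "?", paste(paste0(names(query), "=", values), collapse = "&"))
}

url <- build_crime_query("2022-11-01")
# Fetching needs network access and the jsonlite package:
# crimes <- jsonlite::fromJSON(url)
```

Requesting only rows newer than the last stored incident keeps each weekly refresh small instead of downloading the whole dataset again.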
Data Cleaning
To enable users to spot the nearest police station, we decided to add the locations of police stations to the crime map visualisation. We sourced the location information from Los Angeles GeoHub, a public platform for accessing location-based data, using the dataset LA County Sheriff and Police Stations. The dataset is maintained through the County Location Management System by a government organisation.
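Once station coordinates are available, finding the nearest station to a point is a distance calculation. The sketch below uses the haversine great-circle formula in base R. The function names, the column names `name`, `latitude`, and `longitude`, and the sample stations are all hypothetical and only illustrate the idea.

```r
# Great-circle distance in kilometres between one point and a vector of points.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Return the name of, and distance to, the station closest to (lat, lon).
nearest_station <- function(lat, lon, stations) {
  d <- haversine_km(lat, lon, stations$latitude, stations$longitude)
  nearest <- which.min(d)
  data.frame(name = stations$name[nearest], distance_km = d[nearest],
             stringsAsFactors = FALSE)
}

# Hypothetical stations with made-up coordinates, for illustration only.
stations <- data.frame(
  name = c("Station A", "Station B", "Station C"),
  latitude = c(34.05, 34.10, 33.95),
  longitude = c(-118.25, -118.33, -118.40),
  stringsAsFactors = FALSE
)
closest <- nearest_station(34.09, -118.32, stations)
```

A straight-line distance ignores the road network, so it is only a first approximation of which station is quickest to reach.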
Data Cleaning
To increase the usability and relevance of this application, we decided to include data on Los Angeles's top tourist attractions, allowing users, especially tourists, to take note of crime hotspots near these attractions. To obtain the coordinates of the attractions, we used a Google Maps list of the top 65 tourist attractions in Los Angeles, saved as a KML file and parsed in R.
Data Cleaning
library(dplyr)    # select(), tibble() and the %>% pipe
library(stringr)  # str_c(), str_replace()
library(xml2)     # read_xml(), xml_find_all(), xml_find_first(), xml_text()

# Read every Placemark that contains a Point from a KML file and return
# one row per point with its name, description, style and coordinates.
#   x:       path to the KML file
#   layer:   namespace prefix to query; xml2 labels the default namespace "d1"
#   verbose: if TRUE, report coordinates that fall outside the expected range
kml_points <- function(x, layer = "d1", verbose = TRUE) {

  # Extract one field from each Placemark in a nodeset.
  # Returns a character vector; empty fields come back as NA.
  get_field <- function(x, field, layer = "d1") {
    # Query node by node so that a missing field yields NA instead of being dropped
    lapply(x, xml_find_first, str_c(layer, ":", field)) %>%
      vapply(xml_text, character(1))
  }

  # Select the parent (Placemark) of every Point element
  x <- read_xml(x) %>%
    xml_find_all(str_c("//", layer, ":Point/.."))

  # tibble() replaces the deprecated data_frame()
  x <- tibble(
    name        = get_field(x, "name", layer),
    description = get_field(x, "description", layer),
    styleUrl    = get_field(x, "styleUrl", layer),
    coordinates = get_field(x, str_c("Point/", layer, ":coordinates"), layer)
  )

  # KML stores coordinates as "longitude,latitude,altitude"
  x$longitude <- kml_coordinate(x$coordinates, 1, verbose)
  x$latitude  <- kml_coordinate(x$coordinates, 2, verbose)
  x$altitude  <- kml_coordinate(x$coordinates, 3, verbose)

  select(x, -coordinates)
}

# Pull the coord-th component (1 = longitude, 2 = latitude, 3 = altitude)
# out of each "lon,lat,alt" string and convert it to a number.
kml_coordinate <- function(x, coord, verbose = TRUE) {
  x <- str_replace(x, "(.*),(.*),(.*)", str_c("\\", coord)) %>%
    as.numeric()

  # na.rm = TRUE keeps the checks from failing when a coordinate is missing
  if (verbose && coord == 1 && any(abs(x) > 180, na.rm = TRUE))
    message("Some longitudes are not contained within [-180, 180].")
  if (verbose && coord == 2 && any(abs(x) > 90, na.rm = TRUE))
    message("Some latitudes are not contained within [-90, 90].")
  if (verbose && coord == 3 && any(x < 0, na.rm = TRUE))
    message("Some altitudes are below sea level.")

  x
}

attractions <- kml_points("Los Angeles map.kml")
Crimescope visualises both up-to-date and historical crime information so that users can understand the crime situation in the region. Its main features are introduced as follows:
The dashboard page gives the user an overview of recent statistics from the dataset. The value boxes at the top of the page show the total number of crimes this month, the increase or decrease from last month, and the most common crime type this month. These give the user an instant understanding of the recent situation and trend without any interpretation required.
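The three value-box figures can be derived from the incident table with a few aggregations. The following is a minimal base R sketch, not the app's actual code; the function `monthly_summary()` and the column names `date_occ` and `crime_type` are hypothetical, and the sample rows are invented.

```r
# Compute the figures behind the value boxes for one month ("YYYY-MM").
# `crimes` needs a Date column `date_occ` and a character column `crime_type`.
monthly_summary <- function(crimes, month) {
  ym <- format(crimes$date_occ, "%Y-%m")
  first_day <- as.Date(paste0(month, "-01"))
  previous <- format(seq(first_day, by = "-1 month", length.out = 2)[2], "%Y-%m")
  this_n <- sum(ym == month)
  prev_n <- sum(ym == previous)
  types <- table(crimes$crime_type[ym == month])
  list(
    total = this_n,
    # Percentage change from the previous month; NA if that month has no data
    change_pct = if (prev_n > 0) 100 * (this_n - prev_n) / prev_n else NA_real_,
    most_common = if (this_n > 0) names(types)[which.max(types)] else NA_character_
  )
}

# Invented sample incidents, for illustration only.
crimes <- data.frame(
  date_occ = as.Date(c("2022-10-03", "2022-10-17",
                       "2022-11-02", "2022-11-09", "2022-11-21")),
  crime_type = c("Burglary", "Vehicle theft",
                 "Burglary", "Vehicle theft", "Burglary"),
  stringsAsFactors = FALSE
)
summary_nov <- monthly_summary(crimes, "2022-11")
```

With this sample, November has three incidents against two in October, so the change is a 50% increase and the most common type is "Burglary".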
Below the value boxes are line charts showing the trend in the number of crimes by month and by day of the week. At the bottom of the page, an ordered bar chart shows the number of crimes in each area this month, and a pie chart shows the breakdown of crime types that occurred this month.
With this information, the user can watch out for the crime types that have been most common recently, avoid dangerous locations, and see which days or months are generally safer.
The data shown here is mostly grouped by month because the dataset is updated only weekly or monthly and many individual days have no data, so a day-by-day display would be patchy and hard to read.
Crimes on a day-to-day basis are presented in a map view, available as either a point map or a heatmap. Users can select a specific date to view all crimes from that day.
In the point map, crimes can be filtered by category, with all crimes shown by default. The categories are colour-coded, as indicated by the legend on the map. Key landmarks such as attractions and police stations can also be toggled on; they are marked by icons, with a yellow police badge for police stations and a blue location marker for attractions. These landmarks show tourists and residents which crimes have occurred around any attraction in LA, and where the nearest police station is for a given location.
The heatmap, on the other hand, provides an overall view of the crime hotspots in LA, which are shown as areas in red.
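The idea behind a hotspot view can be approximated by counting incidents per grid cell; the cells with the highest counts are the ones a heatmap would colour most intensely. The sketch below is a simplified stand-in for whatever smoothing the mapping library applies. The function `grid_counts()`, the cell size, and the sample coordinates are assumptions for illustration.

```r
# Count incidents per grid cell and list the densest cells first.
# `cell_deg` is the cell size in degrees (0.01 degrees of latitude is about 1.1 km).
grid_counts <- function(lat, lon, cell_deg = 0.01) {
  cells <- data.frame(
    lat_idx = floor(lat / cell_deg),  # integer row index of the cell
    lon_idx = floor(lon / cell_deg),  # integer column index of the cell
    n = 1L
  )
  counts <- aggregate(n ~ lat_idx + lon_idx, data = cells, FUN = sum)
  # South-west corner of each cell, for plotting
  counts$lat_min <- counts$lat_idx * cell_deg
  counts$lon_min <- counts$lon_idx * cell_deg
  counts[order(-counts$n), ]
}

# Invented incident coordinates, for illustration only.
lat <- c(34.051, 34.052, 34.058, 34.101)
lon <- c(-118.251, -118.252, -118.259, -118.301)
hotspots <- grid_counts(lat, lon)
```

Here the first three points fall into the same cell, so that cell tops the list with a count of three.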
The Crime Type tab presents historical crime data in easy-to-read graphs. The information is organised into three categories, Crime over Time, Top 20 Crimes, and Proportion of Crime by Year, each with further tabs that break the information down by different attributes.
The Crime over Time graphs cover crime by hour, by day, by hour and day combined, by month, and by year. Each time period has its own tab so that the user can easily toggle between them. Most of the graphs are line charts that show the data over time, while a treemap is used to visualise crime by hour and day.
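The hour-and-day view rests on a cross-tabulation of incidents by weekday and hour. A minimal base R sketch of that step is shown below. It assumes the time of occurrence is stored as a 24-hour "HHMM" string; the function `hour_by_weekday()` and the sample values are hypothetical.

```r
# Cross-tabulate incidents by weekday and hour of occurrence.
# ASSUMPTION: `time_occ` is a 24-hour "HHMM" string such as "0930" or "2145".
hour_by_weekday <- function(date_occ, time_occ) {
  hour <- factor(as.integer(substr(time_occ, 1, 2)), levels = 0:23)
  # "%u" is the ISO weekday number (1 = Monday), which does not depend on locale
  weekday <- factor(format(as.Date(date_occ), "%u"),
                    levels = as.character(1:7),
                    labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"))
  table(weekday = weekday, hour = hour)
}

# Invented incidents: two on a Monday evening, one on a Saturday morning.
counts <- hour_by_weekday(
  date_occ = c("2022-11-07", "2022-11-07", "2022-11-12"),
  time_occ = c("2145", "2110", "0930")
)
```

Fixing the factor levels in advance keeps empty hours and weekdays in the table, so every chart built from it has the same 7 by 24 layout.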
The top twenty crimes in Los Angeles can be viewed as a bar chart or as a treemap, each available as a tab for users to toggle between.
The proportion of crime can be viewed by year or by area. In the by-year view, crime types are colour-coded and placed side by side to show how the mix of crime types is similar or different from year to year. In the by-area view, an interactive treemap shows which areas have the most crimes, and users can click into each area to see its breakdown of crime types and their proportions.
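The by-year comparison amounts to normalising crime-type counts within each year. The sketch below shows one way to compute those shares in base R; the function `crime_share_by_year()` and the sample rows are invented for illustration.

```r
# Share of each crime type within each year (every row sums to 1).
crime_share_by_year <- function(date_occ, crime_type) {
  year <- format(as.Date(date_occ), "%Y")
  prop.table(table(year = year, crime_type = crime_type), margin = 1)
}

# Invented incidents: four in 2021 and four in 2022.
shares <- crime_share_by_year(
  date_occ = c("2021-01-05", "2021-03-11", "2021-06-20", "2021-09-02",
               "2022-02-14", "2022-04-01", "2022-07-19", "2022-10-30"),
  crime_type = c("Burglary", "Burglary", "Robbery", "Vehicle theft",
                 "Burglary", "Robbery", "Robbery", "Robbery")
)
```

Comparing shares rather than raw counts keeps a partial year, such as the current one, comparable with complete years.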
The Victim Analysis tab allows users to investigate crimes targeted at a particular demographic group, including their own. Users can select the age group, sex, and ethnicity of the victim to see which crimes have targeted that group, based on the historical crime data for the region.
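A demographic filter of this kind can be sketched as a row filter followed by a count per crime type. The example below is a base R illustration under assumed names: the function `top_crimes_for_group()`, the columns `vict_age`, `vict_sex`, `vict_descent`, and `crime_type`, and the sample rows are all hypothetical.

```r
# Rank crime types for victims matching the selected filters.
# A NULL filter means "no restriction" on that attribute.
top_crimes_for_group <- function(crimes, age_range = NULL, sex = NULL,
                                 descent = NULL, n = 5) {
  keep <- rep(TRUE, nrow(crimes))
  if (!is.null(age_range)) {
    keep <- keep & crimes$vict_age >= age_range[1] & crimes$vict_age <= age_range[2]
  }
  if (!is.null(sex)) keep <- keep & crimes$vict_sex %in% sex
  if (!is.null(descent)) keep <- keep & crimes$vict_descent %in% descent
  counts <- sort(table(crimes$crime_type[which(keep)]), decreasing = TRUE)
  ranked <- data.frame(crime_type = as.character(names(counts)),
                       n = as.integer(counts),
                       stringsAsFactors = FALSE)
  head(ranked, n)
}

# Invented incidents, for illustration only.
crimes <- data.frame(
  crime_type   = c("Burglary", "Robbery", "Robbery", "Vehicle theft", "Robbery", "Burglary"),
  vict_age     = c(34, 27, 22, 45, 29, 61),
  vict_sex     = c("F", "F", "F", "M", "M", "F"),
  vict_descent = c("H", "W", "B", "W", "H", "A"),
  stringsAsFactors = FALSE
)
result <- top_crimes_for_group(crimes, age_range = c(18, 35), sex = "F")
```

In this sample, three incidents match female victims aged 18 to 35: two robberies and one burglary.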
For users who want to know which weapons are commonly used in crimes, this feature displays the proportions of weapons used in different crimes on a selected day. Hovering the mouse over a bar shows the specific type of weapon and its proportion.
There are certain limitations to our application that we would like to point out:
More functionality could be added to Crimescope, such as using the user's current location to give suggestions that better protect their safety.
The app also has the potential to scale horizontally by including datasets for other parts of the United States or even other countries, enabling users to refer to it when travelling outside LA.
Furthermore, the app could be adapted to offer more complex and detailed visualisations for other kinds of users, such as police forces or government institutions, which could use it to study crime patterns and take precautions.
Predictive models, for example using machine learning and predictive data mining, could also be built on the crime data currently available.
Contreras, R. (2022, September 10). Survey: Homicides down midyear as overall violent crime jumps. Axios. https://www.axios.com/2022/09/10/homicides-down-midyear-overall-violent-crime-up
Helling, A. A. (2022, November 4). Is Los Angeles safe to visit in 2022?: Safety concerns. Travellers Worldwide. Retrieved November 12, 2022, from https://travellersworldwide.com/is-los-angeles-safe/
Lock, S. (2021, November 9). Los Angeles: Visitor count 2020. Statista. Retrieved November 12, 2022, from https://www.statista.com/statistics/977116/number-of-tourists-los-angeles-california/